If you're a system administrator or Webmaster
for one of the Fortune 1000 companies
in the world, the cost of a powerful
search engine tool for your Internet/intranet
Web sites may not be a major concern. But if
you're a bit smaller, or you're concerned
about paying a vendor based on the number
of documents you index, you might want to
read on.

Recently, I ran into a search engine tool for 
Web sites at the University of Arizona that 
caught my attention. The search engine they 
created is called Webglimpse, and I've been 
impressed by what I've seen. Webglimpse is 
used by Web sites around the world for their 
search needs. This includes organizations
like the American Cancer Society
(www.cancer.org/), shown in Figure A.




What's Webglimpse?

Webglimpse is a search engine for Internet,
intranet, and extranet Web sites. It offers
some of the same search engine features you'll
find at Internet sites like AltaVista and
Excite, at a fraction of the cost of these
well-known commercial programs.

Webglimpse is (primarily) a Perl/CGI program
that accepts input from HTML forms. It
interacts with Glimpse indexes to locate the
documents you might be interested in. Glimpse
is a command-line tool for Solaris, also
provided by the University of Arizona. Glimpse
lets you perform lightning-fast searches from
the Solaris command line, and Webglimpse
extends that capability to the Web.

To Web site visitors, your Web pages can
contain search forms that look similar to
AltaVista or Excite forms. Just design your
HTML pages as you normally would, and
then add a variation of the Webglimpse
search form wherever you want it. If you're
interested in creating a search engine that
spans multiple Web sites, Webglimpse also
offers the ability to traverse remote Web sites.

Figure A: Here's Webglimpse in action at the American Cancer Society.

Features

In addition to its low price, I find several 
things about Webglimpse attractive. Its special 
features include the following: 

• Webglimpse can account for misspellings 
(up to two characters) in your searches. 

• When Webglimpse finds documents that 
match your search, it can provide hyper- 
links directly to the lines of returned docu- 
ments that match your search pattern. 

• As an administrator, you can specify the
names of documents to exclude from
searches (for example, documents like *.old
or *.bak).

• Because you have the source code for the 
CGI program, you can modify the format of 
your output pages. 

• You can specify the maximum number of 
items returned. 

• You can quickly add search boxes to all of 
the Web pages in a directory tree. 

• You can specify the maximum number of 
characters printed per match. 

• It can limit results to recent files.

• It's configurable for multiple domains on 
one server. 

Webglimpse also offers several traditional 
search engine features: 

• Traditional keyword searching 

• Partial word matches 

• Case-sensitive searching 

• The use of Boolean queries (AND, OR, NOT 
word combinations) 

Limitations 

While Webglimpse does many things very
well, it doesn't provide a relevancy factor, or
weighting, of the pages it returns. Most search
engines, like Infoseek, rank the documents a
search returns from 100 percent relevant down
to 1 percent, and list the documents in that
order. (Of course, how they determine what's
relevant is another story.) Webglimpse doesn't
offer that capability. From what I've seen, this
appears to be its most severe limitation.

Having said that, this limitation is partially 
overcome by the way Webglimpse prints the 
lines from the documents that match your 
search criteria. While traditional search 
engines print a paragraph from a document's 
"description" META tag, Webglimpse prints the 
exact lines that are matched by your search 
criteria. 

With this feature, you can see the matching 
lines in the document, and determine the rele- 
vancy of the match yourself. It just depends on 
whether you prefer to see weighting factors or 
the actual document text. 
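
You can get a feel for this style of output from
the command line with grep, which likewise prints
the matching lines rather than a summary. This is
only a stand-in, not Webglimpse itself; the
function name and the sample file are made up for
illustration.

```shell
#!/bin/sh
# Stand-in illustration using grep, not Webglimpse itself.

# show_matches PATTERN FILE...: print the file name, line number,
# and text of every line that contains PATTERN.
show_matches() {
    pattern=$1
    shift
    # The extra /dev/null operand makes grep print the file name
    # even when only one file is given.
    grep -n -- "$pattern" "$@" /dev/null
}

# Demonstration on a scratch file (hypothetical contents).
tmp=$(mktemp -d)
cat > "$tmp/core.html" <<'EOF'
<h1>Core directives</h1>
<p>DocumentRoot sets the directory served by httpd.</p>
<p>See also srm.conf.</p>
EOF

show_matches DocumentRoot "$tmp/core.html"
rm -rf "$tmp"
```

Each line of output carries the file name and line
number, so you can judge the relevance of a match
from its actual text.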

Requirements 

Webglimpse has a few requirements that you 
should be aware of before proceeding with an 
installation. First, Webglimpse is primarily a 
Perl program, so you'll need to have Perl 
installed, specifically Perl 5.0 or newer. 

Second, Webglimpse requires that Glimpse be
installed first. Specifically, Webglimpse 1.6
requires Glimpse v4.1 or newer.

Third, you'll need a C compiler — but only 
during the installation. When you install 
Webglimpse, it compiles a couple of small pro- 
grams based on the environment settings you 
provide. The standard C compiler or GNU 
compiler should work just fine. 
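
Before unpacking anything, you can check these
requirements from the shell. The following is only
a sketch: it assumes the tools are on your PATH
under the names perl, glimpse, and cc or gcc, and
the helper names have_cmd and report are my own.
It doesn't check the Glimpse version.

```shell
#!/bin/sh
# Pre-installation check (a sketch; command names are assumptions).

# have_cmd NAME: succeed if NAME can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# report NAME: print whether NAME was found.
report() {
    if have_cmd "$1"; then
        echo "found:   $1"
    else
        echo "missing: $1"
    fi
}

report perl
report glimpse

# Either the standard C compiler or GNU's gcc will do.
if have_cmd cc || have_cmd gcc; then
    echo "found:   a C compiler"
else
    echo "missing: a C compiler (cc or gcc)"
fi

# Perl 5.0 or newer is required; $] holds the interpreter version.
if have_cmd perl; then
    perl -e 'exit($] >= 5 ? 0 : 1)' && echo "perl is version 5 or newer"
fi
```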

Downloading and installing 

To use Webglimpse, you'll need to download 
it, install it, and then configure your first 
archive. Webglimpse can be downloaded from 
these URLs: 

• http://tucson.com/webglimpse 

• http://donkey.cs.arizona.edu/webglimpse/

The tucson.com URL is now the preferred 
URL, because the Internet Workshop group 
provides support for the product. Just look for 
the Webglimpse download section, where you 
can download the tarred and gzipped file. 

After I downloaded the software, the instal- 
lation was relatively easy, but there were two 
problems worth mentioning. First, make sure 
you meet the mentioned requirements. These 
requirements aren't explicitly stated in the 
installation documentation. 
Second, the Makefile script for Solaris
assumes that you'll be using GNU's gcc
compiler on a Solaris platform. If you're using
Sun's cc compiler, you'll want to edit the
Makefile before running wginstall, or else the
install program will bomb and you'll need to
start over.
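
If you'd rather script that edit than make it by
hand, something like the following may help. It's
a sketch under one big assumption: that the
Makefile selects the compiler with a line of the
form "CC = gcc". I haven't confirmed that the
Webglimpse Makefile is written this way, so look
at the file first. The function names are my own.

```shell
#!/bin/sh
# Assumption: the compiler is chosen by a make variable named CC.

# use_sun_cc FILE: rewrite a "CC = gcc" line to "CC = cc",
# keeping the original as FILE.orig.
use_sun_cc() {
    makefile=$1
    cp "$makefile" "$makefile.orig" || return 1
    sed 's/^\(CC[[:space:]]*=[[:space:]]*\)gcc[[:space:]]*$/\1cc/' \
        "$makefile.orig" > "$makefile"
}

# show_cc FILE: show which compiler FILE currently names.
show_cc() {
    grep '^CC[[:space:]]*=' "$1"
}

# Demonstration on a scratch file, not the real Makefile.
tmp=$(mktemp -d)
printf 'CC = gcc\nCFLAGS = -O\n' > "$tmp/Makefile"
use_sun_cc "$tmp/Makefile"
show_cc "$tmp/Makefile"
rm -rf "$tmp"
```

Because the original is kept as Makefile.orig, you
can compare the two files before running wginstall.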

Everything else seemed to follow the installa- 
tion instructions closely. When you're finished, 
Webglimpse will be installed in a directory such
as /usr/local/webglimpse, and a few utility
programs will be installed in /usr/local/bin.

In summary, the installation procedure 
could be a bit smoother, but it's already much 
better than the earlier procedure for
Webglimpse 1.5. For me, the installation was the
hardest part about using Webglimpse. 

Creating a local archive 

Once you have Webglimpse installed, you're 
ready to create your first archive. You can cre- 
ate this archive when you're prompted at the 
end of the installation procedure, or wait until 
a later time. For the purposes of this article, I
suggest that you wait and create your first
archive in just a few moments.

Any time you want to create a Webglimpse
archive, you'll run a Perl program named
confarc (short for configure archive), which is
located in the Webglimpse directory
/usr/local/webglimpse, by default. The confarc
program prompts you with a series of questions,
and creates an archive based on your replies. It
also generates a sample HTML search page that
you can use to test your archive, and include in
your own HTML documents.

For the remainder of the discussion, I'll 
need to refer to a sample directory and a sam- 
ple URL on a Web server. The sample direc- 
tory I'll refer to is the Apache Web server's 
HTML user's manual directory: 

/usr/local/lib/apache/htdocs/manual

I'll use these manuals as an example of a group 
of HTML documents that you might want to 
index for later searching. On my intranet, the 
URL for the manual directory is: 

http://intranet.missiondata.com/manual/

With this as a sample directory, we're ready to
proceed. Move to this directory, and run the
confarc program:

cd /usr/local/lib/apache/htdocs/manual
/usr/local/webglimpse/confarc



The confarc command prompts you with
several questions you need to be prepared to 
answer. In this first example, we'll just create 
an index of local files, so we'll only need to 
answer these four questions: 

1. In what directory do you want to store the 
Webglimpse-generated index files? 

2. What title do you want to put on the 
archive? 

3. Do you want to index by (D) directory or 
(T) traverse URLs? 

4. What's the full path of the directory to be 
indexed? (Must be accessible from the Web.) 

For the purposes of my intranet server and 
the manual directory, the answers for these 
four questions are: 

1. /usr/local/lib/apache/htdocs/manual

2. Index of the Apache manuals

3. d

4. /usr/local/lib/apache/htdocs/manual

As fair warning, be careful not to select t in
step three. If you enter t, things start to get
very interesting, because this gives you the
power to index remote Web servers,
which we don't want quite yet.

Once you've answered these questions,
confarc builds the index files you need. In fact,
it creates a large number of files, most of them
matching the .wg*, .g*, and wg* wildcard
patterns. The process also creates another file
named archive.cfg, and a directory named
.remote (for indexing remote sites). Table A
lists the most important files generated by
confarc, with a brief description of each.

The confarc procedure also generates two
sample HTML search forms you can use, 
named wgindex.html, and wgall.html. These files 
will also be located in the manual directory (or 
whatever directory you chose to index). 

The first HTML page, wgindex.html, 
includes the standard Webglimpse search 
form. It lets you specify: 

• The text string(s) to search for 

• Whether the search should be case-sensitive

• Whether you want to match partial words

• Whether you want to jump right to the lines
in the files matching your search

www.zdjournals.com/sun, April 1999



The second page, wgall.html, lets you add a
little more precision to the search, because it
lets you specify the directory tree you want to
search via a drop-down scrolling list. This can
be very good for power users who know
what's in each directory tree, but might not be
good for people who don't know where the
desirable documents may be located.

Table A: The most important files generated by the confarc program

File              Description

archive.cfg       Defines the archive configuration parameters
                  for Webglimpse. Built from the questions
                  asked by confarc.

.wgfilter-index   Defines the files that you do and don't want
                  to index.

.wg_err           A list of error messages.

wg_log            A list of all collected files and URLs,
                  including files indexed and excluded.

wgreindex         An auto-generated command file that can be
                  used to build the archive and index. You can
                  put this command in your cron file to rebuild
                  the index periodically.

.wgsiteconf       Configuration information about your
                  Web site.

Figure B: The file wgindex.html contains the standard Webglimpse
search form.

Testing Webglimpse 

Once you've generated the index files and 
search forms with confarc, all you need to do 
to test the installation is point your Web 
browser to the proper URL. In the case of my 
server, the proper URL is: 

http://intranet.missiondata.com/manual/wgindex.html

When you point your browser at this URL, 
you'll see a form similar to the one shown in 
Figure B. To test the installation, just type
a word that should be in the
archive and click the Submit button. In the
case of the Apache online manuals, I searched 
for strings like srm.conf and DocumentRoot. (If 
you're interested, I also recommend selecting 
the Jump To Line option.) 

If everything works okay, your search results 
will be displayed in the next page. You'll see 
results similar to those shown in Figure C. 

As you can see from Figure C, turning on 
the Jump To Line feature of Webglimpse 
makes the search even more powerful. If you 
see a line you're interested in, you can just 
click that line. Webglimpse takes you directly 
to the line inside the document you select. 

Power searching 

The options provided on the auto-generated 
HTML forms (case-sensitive, partial match, 
etc.) are very powerful, and for the most part, 
self-explanatory. If you don't know what they 
mean at first, you'll learn very quickly with a 
little experimentation. 

Some of the other search features that 
Webglimpse offers aren't immediately obvi- 
ous. With Webglimpse, I've found that you 
can combine words in your searches using the 
Glimpse command-line syntax. 

For instance, if you want to perform 
searches on a combination of words, using a 
Boolean AND operation, just connect your 
search keywords with a semi-colon (;). 

As an example, if you want to search for 
documents containing the strings srm.conf 
and DocumentRoot, just enter this syntax in the 
search box: 

srm.conf;DocumentRoot

Note: Be careful not to include spaces in the
search pattern that you don't really want.
Spaces will affect your search results.

If, instead, you want to search for documents 
containing the strings srm.conf OR 
DocumentRoot, just use the comma (,) instead to 
separate the words: 

srm.conf,DocumentRoot

I really like these features, and I'd recommend 
including references to them on your search 
forms. I suppose it's probably more standard
to support keywords like AND and OR in
search engine tools. If you're a sharp Perl
programmer and you'd rather support
these words instead, it won't take much to
code this change. I suspect
the comma and semi-colon characters are left
the way they are to give the end user more
precision in their searches.
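
If you want to check what a ; or , query ought to
return, you can reproduce the same Boolean logic
with grep over the files you indexed. This is a
sketch using grep as a stand-in, not a description
of how Webglimpse or Glimpse work internally, and
the function names are my own. It tests whole
files, following the description above of
documents containing both strings; Glimpse itself
may apply an AND to a single line at a time, so
treat the results as an approximation.

```shell
#!/bin/sh
# Boolean searches emulated with grep (a stand-in, not Webglimpse).

# files_with_both A B FILE...: list files containing string A AND string B.
files_with_both() {
    a=$1
    b=$2
    shift 2
    for f in "$@"; do
        if grep -F -q -- "$a" "$f" && grep -F -q -- "$b" "$f"; then
            echo "$f"
        fi
    done
}

# files_with_either A B FILE...: list files containing string A OR string B.
files_with_either() {
    a=$1
    b=$2
    shift 2
    grep -F -l -e "$a" -e "$b" -- "$@"
}

# Example (run from the directory you indexed):
#   files_with_both   srm.conf DocumentRoot *.html
#   files_with_either srm.conf DocumentRoot *.html
```

The -F option makes grep treat the search strings
literally, so the dot in srm.conf matches only a
dot.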

Running a periodic index with crontab

If you have an active Internet or intranet 
Web site, you'll probably want to run the 
index program, wgreindex, fairly often, so 
new files will be added to the index automat- 
ically. You can easily set this up by adding a 
line to your crontab file for each index you 
want to regenerate. 

For my purposes, a line similar to this 
works just fine: 

55 11 * * * /usr/local/lib/apache/htdocs/manual/wgreindex -q > /dev/null 2>&1

If you prefer to retain the wgreindex output, 
just redirect the output streams to your 
desired output files. 
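
One way to keep the output is a small wrapper that
cron runs in place of wgreindex itself. The sketch
below passes along the same -q flag used in the
crontab line; the function name and the log
location in the example call are assumptions of
mine, so adjust them for your archive.

```shell
#!/bin/sh
# run_reindex COMMAND LOGFILE: run an indexing command with -q and
# append its output, with start and end timestamps, to LOGFILE.
run_reindex() {
    cmd=$1
    log=$2
    echo "=== reindex started: $(date)" >> "$log"
    "$cmd" -q >> "$log" 2>&1
    status=$?
    echo "=== reindex finished with status $status: $(date)" >> "$log"
    return "$status"
}

# Example call (paths are assumptions based on the sample archive):
#   run_reindex /usr/local/lib/apache/htdocs/manual/wgreindex \
#       /var/tmp/wgreindex.log
```

The wrapper returns the exit status of the indexing
command, so a failed run is still visible to
whatever invoked it.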

Indexing remote Web sites 

If you're interested in creating a multi-site 
search engine, one of the great things about 
Webglimpse is the ability to index documents 
contained on other Web servers. This capabil- 
ity can quickly turn your Web server into a 
small version of AltaVista or Excite. (I don't
mention Yahoo! here because, technically,
most of what Yahoo! offers is a
directory, which is different from a
document-based search engine.)



Before attempting to index any remote sites 
with Webglimpse, I suggest that you first print 
the support document titled "Configuring an 
archive." This document contains a descrip- 
tion of each question you'll be prompted for, 
and it will be very helpful. 

Also, you may need to create a document 
on your local Web server that contains hyper- 
links to the remote sites you want to index. 
For instance, let's assume that you want to 
index two remote sites named coyote.acme.com
and bugs.bunny.com. In this case, you'll need to
create a document on your server with entries
similar to these:

<a href="http://coyote.acme.com">coyote</a>
<a href="http://bugs.bunny.com">bugs</a>

If you don't already have a document with
remote links like this, you should create it, and
save it with a suitable name, perhaps
something like remote-sites.html. For lack of a
better term, you're going to feed this document
to confarc in just a few moments. As a final note,
be sure to save this document in a location
where it can be accessed via a valid URL. I
saved this document in a directory named
/usr/local/lib/apache/htdocs/remote.
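
If you have more than a couple of sites to list,
you can generate the feeder document instead of
typing it. The following is a minimal sketch; the
function name make_feeder and the surrounding HTML
boilerplate are my own choices, and each link
simply uses its URL as the link text.

```shell
#!/bin/sh
# make_feeder FILE URL...: write a minimal HTML page linking to each URL.
make_feeder() {
    out=$1
    shift
    {
        echo '<html><head><title>Remote sites to index</title></head><body>'
        for url in "$@"; do
            printf '<a href="%s">%s</a><br>\n' "$url" "$url"
        done
        echo '</body></html>'
    } > "$out"
}

# Example, using the two sample hosts from the text:
#   make_feeder /usr/local/lib/apache/htdocs/remote/remote-sites.html \
#       http://coyote.acme.com http://bugs.bunny.com
```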

With this feeder document in place, you're
now ready to begin the process of indexing
these remote Web sites. If you're ready, move
to the directory containing your feeder
document, and start the confarc program just as
you did before:

cd /usr/local/lib/apache/htdocs/remote
/usr/local/webglimpse/confarc

Figure C: The search results page demonstrates the Jump To Line feature
that Webglimpse provides.

Answer confarc's prompts as usual, until it
asks, "Do you allow traversal of remote pages?"
Previously you answered n at this point, but
now you want to answer y.

With this reply, confarc begins prompting
you with a series of seven additional questions.
This is where it helps to have the printed
document, "Configuring an archive," in hand.

I'll skip the discussion of the first six prompts, 
because their meaning is fairly well-defined in 
the "Configuring an archive" document. 

The seventh prompt looks like this: 

Now you'll need to enter the URL(s) of the 
file(s) you'd like to traverse: 

This is where you should supply the URL to 
your remote-sites.html feeder document. For 
my setup, the proper URL is: 

http://intranet.missiondata.com/remote/remote-sites.html

At this point, Webglimpse prompts you for
the next URL. Here you can keep entering URLs
until you're finished, or hit [Enter] when you
have no more URLs to provide. If the
remote-sites.html document is the only document
you want to refer to, just hit [Enter].

Now, confarc will begin its indexing process.
Depending on the URLs you provided, the 
options you selected, and the speed of your 
network, this can be fairly quick, or quite 
lengthy. 

When confarc finishes, you can test the
archive just as you did before, by pointing your
browser to the proper URL on your server:

http://intranet.missiondata.com/remote/wgindex.html



Then, enter a few keywords to search for. If 
everything succeeded, you should be able to 
find keywords in remote documents just as eas- 
ily as you find them in your local documents. 

Cost issues and other free 
(or nearly free) search engines 

The price of Webglimpse currently varies from 
free to the high price of $2,000, based on your 
use. For government and educational use, it's 
free; for small businesses, it's available for as 
little as $200. At the high end, the largest
businesses will be charged $2,000, much less than
the cost of well-known search engine
products. I'd suggest checking the Webglimpse
URLs provided earlier for the pricing that
applies to your intended environment.

If you're on a tight budget but interested in 
incorporating a search engine into your Web 
sites, and Webglimpse doesn't meet your 
needs, here are two other search tools that are 
free or nearly free: 

• Webinator, from Thunderstone:
www.thunderstone.com/webinator
(free for small sites)

• ht://Dig, located at www.htdig.org

I haven't personally used these products, but 
noticed their pricing strategies while research- 
ing this article. 

Conclusion 

With Webglimpse, you can add a fast, low-cost, 
document-searching engine to your Internet or 
intranet Web servers. Webglimpse can be used 
to serve documents from your local Web site, 
as well as remote servers, so you can quickly 
start your own search engine site and compete 
with AltaVista, Excite, and others. 
Questions? Comments?

Is your Solaris box singing the blues? Poor performance got you down? Ping
the Solaris Dude today! Send your questions to robt@cymru.com. See you
next month and keep those questions coming!

Lexical and text



by Paul A. Watters 

Every day we seem to find situations that
require manipulating text. This may range
from something as simple as extracting email
addresses from messages to massaging data that
needs to be fed to a supercomputer for analysis.
In this article, we'll look at utilities to make
text processing on Solaris easier.

The processing of text and lexical informa- 
tion was one of the earliest non-numerical 
uses for digital computers, and has continued 
to be one of the most popular. But text analy- 
sis goes beyond simple word processing, espe- 
cially with the preparation of text for some 
kind of data analysis (that is, database entries 
or statistical observations). Fortunately, Solaris 
is well equipped to assist many text processing 
applications. Rather than re-inventing the 
wheel for text processing, an optimal approach 
is to combine custom tools with existing utili- 
ties on the command line to perform complex 
lexical tasks. 

Text utilities 

We'll begin by looking at some of the basic 
text processing utilities available under 
Solaris. We're all familiar with basic com- 
mands like cat, which sends the contents of a 
file to standard output, or more, which presents 
the display of standard output in readable 
segments. These commands can be combined 
on the command line with a pipe (|) to
increase their functionality:

cat filename | more

The same results could be achieved by using
the redirection operator (<) to retrieve the
contents of the file, instead of using cat:

more < filename

However, if filename were a list of simple
database entries from a pet shop, each of which
contained either the field bird or animal, a more
complex query could be formed to retrieve all
the records for animal page by page, by
incorporating the grep command:

cat filename | grep "animal" | more

If we only wanted to sample the first few lines
of filename (which might be the case if a
non-exhaustive search is being performed), we
could use the head utility:

head filename | grep "animal"

Alternatively, if the database was sorted by
animal, and we were interested in finding the
number of maltese terriers, we could use the
uniq utility (with the -c flag). This prints the
first occurrence of each line that differs from
the lines adjacent to it, preceded by a count of
the number of times it occurs:

uniq -c filename

If we simply want to count the number of
lines and the total number of characters, we
can use the wc utility:

wc filename

If the database fields were unsorted, there's a
very useful utility called sort that can sort in
dictionary order (-d), can treat lower case as
upper case when sorting (-f), and even has a
built-in mechanism to sort by three-letter
month abbreviations (-M).

Simple tools

We now examine a simple application that
combines the text processing utilities outlined
above with several small utilities written in C.
The task is to process a text document and
create a numerical representation of the data
that's suitable for statistical analysis. This
involves creating an index of unique terms in the
document, which are then substituted for each
word in the original document to form a set of
numerical vectors that can be read into a
statistical package. This situation arises quite
often in information retrieval applications,
where a database of unique definitions and terms
is created from a set of documents. The first
custom program, listwords, reads in a document
(input.txt) and creates a text file (output.txt)
containing all the words in the document:

listwords input.txt output.txt 5000
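
The custom listwords program isn't reproduced in
this article. As a rough stand-in, assuming that a
word simply means a run of letters, the standard
tr utility can split a document into one word per
line. The function name list_words is
hypothetical. The text doesn't say what the third
argument (5000) means, so the sketch has no
counterpart for it.

```shell
#!/bin/sh
# list_words INPUT OUTPUT: write every word of INPUT to OUTPUT, one
# per line. A stand-in for the custom listwords program; a "word"
# here is any run of the letters A-Z and a-z.
list_words() {
    # Turn each run of non-letters into a single newline, then drop
    # the empty line that appears if the file starts with a non-letter.
    tr -cs 'A-Za-z' '\n' < "$1" | grep -v '^$' > "$2"
}

# Example: list_words input.txt output.txt
```
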

Next, we sort the output.txt file and generate a
list of unique entries:



sort -u output.txt > unique.txt 

At this point, non-words could be excluded
by using the spell utility, if we were only
interested in processing words found in the
dictionary. We then verify the number of unique
entries in the list by using the wc utility:

wc unique.txt

The result (in this case, 500) is then used as 
a parameter for the second custom program, 
createindex, which simply appends an index 
to each entry in the file: 

createindex unique.txt index.txt 500 

The unique numerical types defined by the 
index are then inserted into a corresponding 
position, with an output file generated by the 
third custom program: 

tokenise index.txt input.txt output.txt 
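
The createindex and tokenise programs aren't
listed in this article either. A minimal sketch of
the same two steps with awk follows. It assumes
that the index is simply each unique word followed
by its line number, that words in the input are
separated by white space, and that a word missing
from the index should become 0. The function names
are hypothetical.

```shell
#!/bin/sh
# create_index UNIQUE INDEX: append a running number to each unique word.
create_index() {
    awk '{ print $1, NR }' "$1" > "$2"
}

# tokenise_text INDEX INPUT OUTPUT: replace each word of INPUT with its
# number from INDEX; words missing from the index become 0.
tokenise_text() {
    awk '
        # First file: load the index into an array keyed by word.
        NR == FNR { id[$1] = $2; next }
        # Second file: emit one line of numbers per line of words.
        {
            line = ""
            for (i = 1; i <= NF; i++) {
                n = ($i in id) ? id[$i] : 0
                sep = (i > 1) ? " " : ""
                line = line sep n
            }
            print line
        }
    ' "$1" "$2" > "$3"
}

# Example:
#   create_index unique.txt index.txt
#   tokenise_text index.txt input.txt output.txt
```

Each line of the output file is then a vector of
integers that a statistical package can read.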

Although separate programs or functions 
could have been written as replacements for wc 
and sort, the great advantage of using sepa- 
rate command-line utilities is that a new pro- 
gram doesn't have to be written to perform 
slightly different text processing tasks. In 
addition, new utilities can be written to extend 
the functionality of those included in the 
Solaris distribution. The flexibility of these
utilities can also be combined with C-shell
commands, such as foreach, to process many
files at one time. This is a particularly
important capability, and one that graphical user
interface text processing systems generally lack.
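
The Bourne shell's for loop does the same job as
the C shell's foreach. As a sketch, the following
applies one small pipeline to any number of files;
the function name is hypothetical, and a word is
again assumed to be a run of letters.

```shell
#!/bin/sh
# count_unique_words FILE...: print each file name followed by its
# number of distinct words (compared case-sensitively).
count_unique_words() {
    for f in "$@"; do
        # wc pads its output on some systems; tr strips the padding.
        n=$(tr -cs 'A-Za-z' '\n' < "$f" | grep -v '^$' |
            sort -u | wc -l | tr -d ' ')
        echo "$f $n"
    done
}

# Example: count_unique_words *.txt
```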



Complex tools

Sorting and indexing are just the beginning of
processing text with Solaris. There are more 
complex tools for generating lexical analyzers 
and parsers (for example, lex and yacc). Lex is 
best used for simple pattern matching tasks, 
and can also be used to assign unique types to 
words, as we've demonstrated, using several 
different utilities. While Lex is more compact 
than using several utilities to perform the 
same task, it has the drawback of a difficult 
syntax consisting of definitions, rules, and 
routines, which is less intuitive than dividing 
a complex task between dedicated utilities. 
Yacc also has similar problems, but nicely
handles the processing of complex lists by using
recursion. Awk is yet another useful
pattern-scanning utility, in which a pattern
(a regular expression, optionally combined with
Boolean operators) is associated with some action
(for example, splitting or truncating text).
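
As a small illustration of that pairing of pattern
and action, the following sketch runs awk over a
made-up pet shop file. The field layout (type,
breed, and price separated by colons) and the
function name are assumptions for the example only.

```shell
#!/bin/sh
# animals_over PRICE FILE: for records whose type is "animal" and whose
# price exceeds PRICE, print the breed (truncated to 10 characters)
# and the price.
animals_over() {
    awk -F: -v min="$1" '
        # Pattern: a regular expression combined with a comparison.
        # Action: print a truncated field and the price.
        /^animal:/ && $3 > min { print substr($2, 1, 10), $3 }
    ' "$2"
}

# Example: animals_over 100 pets.txt
```

The -F: option does the splitting, and substr does
the truncating.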

Conclusion 

Solaris has many tools available for text pro- 
cessing, from simple command-line utilities 
that can sort and process text information from 
databases, to sophisticated lexical analysis
programs. A suite of tools can be constructed to
handle many different scenarios without having
to write a new program for each task. Lex
and Yacc aren't for the faint-hearted, but are
worth learning if routines are required for
resolving ambiguities using precedence rules.

Further reading 

As usual, the O'Reilly series of books has a 
useful manual for Lex and Yacc by John 
Levine. The custom utilities described above
can be obtained via email from the author.



Tool of the Month: lslk

The tool of the month is lslk, or list-locks. 
This tool, created by Vic Abell, is a great 
way to list all the system locks. Ever wonder 
what has a file locked up and just won't let 
go? With lslk, you'll know! 

lslk displays the inode and filesystem of
the locked files. With the wealth of options
to lslk, you can avoid kernel deadlocks and
customize the search parameters. lslk also
tracks NFS-based locks. One downside: The
VxFS file system isn't supported.

You can obtain lslk at the following URL:

ftp://vic.cc.purdue.edu/pub/tools/unix/lslk/ 

I don't warranty this tool or its use. How- 
ever, should you encounter any problems 
porting or using the tool, don't hesitate to 
drop me a line. 

Sun/UNIX editors




by Werner Klauser 

There are several different editors available
on UNIX systems. Some of
these editors are designed to run under
X-Windows, while others will run on any
terminal. This article explains how to open each
editor, make a few changes, and then exit the
editor. You can consult the editor's man pages
for additional information. All of these
packages are full-screen editors. Each can be
invoked by typing

$ edit myFile.txt

where edit is the editor that you want to use
and myFile.txt is the file that you want to edit.
When you are using X-Windows, you can
select many of these editors from the Editors |
Word-processing submenu that pops up when
you right-click.

Character-based editors 

While it seems that the world has gone com- 
pletely graphical, there still is a place for 
character-based editors. Following are some 
of the more popular editors available for Solaris. 

vi 

The vi editor is the basic all-purpose text editor.
There are two modes in vi: one for entering or
changing text and the other for command
execution. You begin the session in Command
mode. To switch to Editing mode, type
i to insert before the cursor or a to append
after it. To switch back to Command
mode, press [Esc]. In Command mode, typing x
deletes the character under the cursor; if you
precede x with a number, then that number of
characters is deleted, starting at the cursor.
Use the arrow keys to move around until you find
something that needs to be changed or added;
then type i, a, or x to make changes. You can
also save your file while in Command mode
with :w. If you want to quit, you can use ZZ to
save the current file and quit or :q! to quit
without saving.

emacs 

The emacs editor is much more powerful than
vi. When running under X-Windows, you can
use the mouse to place the insertion point
anywhere in the window; otherwise, just use the
arrow keys. Along the bottom of the emacs
window is a status bar, which provides you
with some useful feedback. Many emacs
commands require two keystrokes. If you
accidentally begin a command, you can press
[Ctrl]G to cancel it.

There's a helpful tutorial that you can access
by pressing [Ctrl]HT. You can save your work
by pressing [Ctrl]X [Ctrl]S. To quit emacs, press
[Ctrl]X [Ctrl]C.

If you haven't saved changes, then you'll be 
prompted to do so. You can also cut and paste 
portions of text easily with emacs. To do this, 
you first set a mark at one end of the text that 
you want to cut by pressing [Ctrl]2. Then you 
move to the other end of the text you're cut- 
ting and press [Ctrl]W. To paste the block of 
text that you've just cut, press [Ctrl]Y. Every 
time you press [Ctrl]Y, a copy of the text that
you cut is inserted. 

pico 

The pico editor, like emacs, can be used from
either a terminal or X-Windows. To make
changes to a file, use the arrow keys to move
around. Once you've gotten to the place where
you want to insert something, start typing.
You can delete text with the [Backspace] or
[Delete] keys. The different commands that
pico supports are displayed along the bottom
of the window, each preceded with a caret (^).
The caret indicates that you should hold down
the [Ctrl] key while pressing the command
key. For instance, the exit command is ^X, which
is equivalent to [Ctrl]X.

If you're looking for a simple but powerful
editor to learn, then pico is probably the one for
you. It has a helpful interface and can be used
from both X-Windows and a regular terminal.

editor 

Like pico, editor has a good interface and 
works well from a terminal. You can use the 
arrow keys to move the insertion point and 
the [Backspace] or [Delete] keys to delete text. 
The supported commands are listed in the top 
half of the window and have a syntax similar 
to pico's. There are some two-keystroke
commands. For instance, to move to the top of a
file, the command is ^KU, which is equivalent
to [Ctrl]KU. The important commands are all
displayed, so you shouldn't have any trouble
learning this package.

X-Windows editors 

Following are some of the more popular X- 
Windows editors available for Solaris. 

textedit 

The textedit editor is an X-Windows-only editor.
Like the other X-Windows editors, textedit
has a helpful graphical user interface. Once
you've started it, you can make changes to
your file. textedit has mouse support, so you
can use the mouse to place your cursor and 
scroll through your document. To exit textedit, 
right-click on the window's title bar. This will 
bring up a menu for that window. Move the 
mouse down to the Destroy Window option 
to quit. Be careful not to select Exit Window 
Manager from the menu, as this would log 
you off of the machine, in addition to quitting 
textedit. 

aXe 

The aXe editor is also an X-Windows editor.
Although the other packages support similar
features, the graphical user interface makes aXe
easy to learn. After starting it, you should see
two windows — one for your document and one 
with aXe written in the middle of it. Your docu- 
ment window contains the text of your current 
document, while the aXe window provides 
easy access to the Online Help feature. Simply 
click on the Help button to get more help. 

Like the other packages, you can use the 
arrow keys to move around. aXe also supports 
searching and cutting and pasting. To cut a 
block of text, select the text with your mouse 
(the selected text will be highlighted). Now, 
from the Delete menu, select Cut. This stores 
the selected text in a buffer. To paste the text 
that you just cut, select Insert | Paste. You can
save your changes and/or quit using the different
options in the Quit menu.

xedit 

The xedit editor is another X-Windows editor.
It isn't as powerful as aXe, but the interface is 
easier to learn because of its simplicity. You 
can cut and copy as you would in the UNIX 
shell using the mouse buttons. For instance, 
select the text that you want to copy, and 
move to the place where you want to insert 
the text. Press the middle mouse button to 
insert the text. There are buttons across the top 
of the window for saving changes, loading 
files, and quitting.




SPARC OpenBoot and
the Forth language



by Boris Loza 

From time to time as a UNIX system administrator,
I've had to work in the Solaris OpenBoot
environment. It's useful for booting the
operating system (boot -r, boot cdrom -s, etc.),
modifying system start-up configuration
parameters (input-device, output-device,
setenv, etc.), troubleshooting (probe-scsi-all,
show-devs, etc.), or running diagnostics (test net,
test /memory, etc.). But sometimes, it isn't enough
to use predefined commands and utilities. For
this purpose, OpenBoot provides a very powerful
environment based on the ANS Forth
programming language.



Some Forth history 

The name Forth was intended to suggest software
for the fourth (next) generation computers,
which Charles Moore (the programmer
who invented it) saw as being characterized
by distributed, small computers. The operating
system he used at the time restricted filenames
to five characters, so the U was discarded.
The first Forth interpreter was written in 1968. 
For the next five years, Forth was imple- 
mented on various CPUs and became widely 
known because of its high performance and 
economy of memory. 
In 1988 Sun Microsystems invented Open
Firmware technology — hardware-independent 
boot code, firmware, and device drivers. Open 
Firmware, then called OpenBoot, allows one 
version of the Boot ROM to run on any config- 
uration of hardware and software. Such tech- 
nology uses Forth as the official language. 

Why Forth for all this? 

Forth is a stack-based, extensible language 
without type-checking. It uses "reverse Pol- 
ish" (postfix) arithmetic notation, familiar to 
users of HP calculators. To add two numbers 
in Forth, you'd type 2 3 + instead of 2 + 3. If 
you're using Forth, you don't need to recom- 
pile your program to add new functionality. 
You can define a new command and it will 
instantly be available for you to use. Because 
of this, the Forth compiler is simpler, smaller, 
and faster than other compilers. So the interac- 
tive Forth system, including an editor, assem- 
bler, and even multitasking support, can easily 
be put in an 8K EPROM! 
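
To make the postfix idea concrete outside of OpenBoot, here's a minimal sketch of a stack-based evaluator written as a shell function. The function name rpn is hypothetical, and it handles only integer operands with the +, -, * and / operators. It illustrates how a postfix expression is evaluated with a stack; it isn't how a Forth interpreter is actually implemented.

```shell
#!/bin/sh
# rpn: evaluate a postfix (reverse Polish) integer expression.
# Hypothetical helper for illustration only; not part of OpenBoot or Forth.
rpn() {
    stack=""    # every element is stored with a leading space; top is last
    for tok in "$@"; do
        case "$tok" in
            "+"|"-"|"*"|"/")
                # Pop two operands: b is the top of the stack, a is below it.
                b=${stack##* }; stack=${stack% *}
                a=${stack##* }; stack=${stack% *}
                case "$tok" in
                    "+") r=$((a + b)) ;;
                    "-") r=$((a - b)) ;;
                    "*") r=$((a * b)) ;;
                    "/") r=$((a / b)) ;;
                esac
                stack="$stack $r"      # push the result
                ;;
            *)
                stack="$stack $tok"    # push an operand
                ;;
        esac
    done
    echo "${stack##* }"                # print the top of the stack
}
```

With this function loaded, rpn 2 3 + prints 5, which mirrors typing 2 3 + in Forth. Quote the multiplication operator ('*') so the shell doesn't expand it as a filename pattern.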

The Forth language 

The fundamental program unit in Forth is the 
word — a named data item, subroutine, or op- 
erator. Actually OpenBoot commands such as 
boot, printenv, and probe-scsi-all are Forth- 
defined words. Programming in Forth con- 
sists of defining new words in terms of an 
existing one. 

You can start programming in Forth at the 
OpenBoot ok prompt (ok is the usual prompt 
in Forth): 

ok : average ( a b -- avg ) + 2/ . ;
ok 10 20 average
ok 15

ok : .ASCII ( end start -- , dump characters )
   do
      cr i . i emit \ Print ASCII characters
   loop ;
ok 70 65 .ASCII 
ABCDEF 
ok 

OpenBoot 3.x contains about 2,450 Forth 
words. All words belong to the dictionary that 
also contains vocabularies (consisting of 
related words and variables). 

The new commands created above would
be lost after rebooting the machine. OpenBoot
provides a way to prevent this by saving them into
NVRAM using nvedit:

ok nvedit
0: : hello ( -- ) cr
1: ." Welcome to OpenBoot!" cr ;
2: ^C
ok nvstore
ok setenv use-nvramrc? true
ok reset-all
ok hello
Welcome to OpenBoot!
ok

By creating customized scripts, you can modify
the OpenBoot start-up sequence. Unfortunately,
you can't use the following commands
in these scripts: boot, go, nvedit, password, reset-all, and
setenv security-mode. OpenBoot also provides various
facilities for debugging Forth programs and
loading and executing programs written in Forth
from Ethernet, a hard disk, or a floppy device.

One of the Forth utilities that we'd like to 
mention here is the built-in Forth language 
decompiler — see. It can be used to re-create 
the source code for any previously defined 
Forth word. For instance: 

ok see scan-subtree
scan-subtree
['] scan-subtree guarded-execute drop

ok see probe-scsi-all
(ffd60c5c) ['] (ffd88b08) scan-subtree

The preceding listing shows that scan-subtree
is composed only of Forth source words
that were compiled as external or as headers
with fcode-debug? set to true. probe-scsi-all is a
different case. It also contains words that were
compiled as headerless and are, consequently,
displayed as hex addresses surrounded by
parentheses. For more information on how to
use Forth development tools, consult the
OpenBoot reference manual.

Forth and shell scripting 

On the Internet, you can find various shareware
ANS Forth compilers for a number of
operating systems. One of the most interesting
is pForth (a portable ANS-style Forth).
After compiling it and linking it to /usr/local/bin/forth,
you can run standalone scripts like
the one shown in Listing A.

Listing A: A simple Forth script that can run on Solaris

#!/usr/local/bin/forth
\ \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
\ Enter a number:
\
\ 1. Item number one
\ 2. Item number two
\ 3. Item number three
\
\ 0. Exit
\ \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
: get-menu-item ( -- n ) \ Query user for a menu option.
   begin
      ." Enter a number: " cr
      ." 1. Item number one" cr
      ." 2. Item number two" cr
      ." 3. Item number three" cr cr
      ." 0. Exit" cr
      key dup emit cr ascii 0 - dup 0 3 between not
   while
      drop
      ." Not a valid menu number! Try again." cr cr
   repeat
   case ( case-value -- )
      1 of ." This is number one" endof
      2 of ." This is number two" endof
      3 of ." This is number three" endof
      0 of ." Exit!" bye endof
   endcase ;

get-menu-item

To run this program, just type the name of the file:

get-menu-item

In our opinion, Forth cannot replace Perl
and UNIX shell programming facilities for system
administration needs. These languages
are specially designed for parsing strings and
I/O manipulation. But you can practice with
Forth scripting in order to be more comfortable
with creating power tools in the OpenBoot
environment.
Sun-provided tool to test
Year 2000-compliance



by Werner Klauser 

The millennium is fast approaching and 
there's a growing urgency to test your 
system for Year 2000-compliance. You 
want to know if there's any source code or tool 
that Sun can provide to help you do the test- 
ing of all the software, including third party 
software, on all the Solaris boxes in your 
organization. 



Download it 

You can download an ABI (Application Binary 
Interface) tool, called y2000, free of cost from 
Sun's Web site at 

www.sun.com/y2000/abi-download.html 

Once you install the tool on your machine, 
you can test any binary executable for Year 
2000-compliance. The tool checks any binary
application and all the dynamic libraries it
depends on for date handling and reports whether
it may be affected by the Year 2000 bug. Since
it's an ABI tool, it can be used on any application,
including third-party software that runs on Solaris.
Be aware that to be absolutely sure of Year
2000-compliance, a final manual review of
the source code is necessary. Still, this tool is an
important starting point. As an example, using
the y2000 tool to test the system's /bin/ls program
results in the output in Listing A.

Summary 

The tool's output shows that y2000
searches for system library functions that
could have potential Year 2000-compliance
problems. You must then carefully determine
whether each potential problem is
an actual problem.



Listing A: Example output from y2000 tool 



This report singles out your application binaries that have
dependencies on time related functions.

Improper use of these functions may cause a Year 2000 transition
problem, but the dependency alone does not signal a Year 2000 error:
manual source code checking and/or transition simulations are the
next required steps.

For additional information on developing Year 2000 safe applications,
please see http://www.sun.com/y2000/devguide.html

Function Dependency Report

These time related functions are checked for:

ascftime   asctime    asctime_r    cftime   ctime
ctime_r    difftime   getdate      gettimeofday   gmtime
gmtime_r   localtime  localtime_r  mktime   settimeofday
strftime   strptime   time         tzset    tzsetwall
utime

The following applications have a dependency on one or more
of the above functions, your next step should be to examine
them them more closely for Year 2000 transition problems:

/bin/ls:
cftime time

1 of your applications out of a total 1 have a dependency
on one or more of the time related functions listed above.

The complete list of applications examined was:
/bin/ls

Note that shell scripts and SUID programs are skipped.
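
The report above boils down to matching an executable's dependencies against a fixed list of time-related functions. The sketch below shows that matching step as a shell function. It's only an illustration, not the y2000 tool's implementation. The names Y2K_FUNCS and time_deps are hypothetical, and the function expects symbol names on standard input, one per line (how you extract them from a binary, for example with nm, varies by platform).

```shell
#!/bin/sh
# Time-related functions named in the y2000 report (Listing A).
Y2K_FUNCS="ascftime asctime asctime_r cftime ctime ctime_r difftime getdate
gettimeofday gmtime gmtime_r localtime localtime_r mktime settimeofday
strftime strptime time tzset tzsetwall utime"

# time_deps: read symbol names on stdin, one per line, and print each one
# that exactly matches an entry in Y2K_FUNCS. Hypothetical helper.
time_deps() {
    while read -r sym; do
        for f in $Y2K_FUNCS; do
            if [ "$sym" = "$f" ]; then
                echo "$sym"
            fi
        done
    done
}
```

For example, feeding it the three names cftime, malloc, and time (one per line) prints cftime and time, the same two functions the report lists for /bin/ls. As with the real tool, a match only tells you where to look; it doesn't prove there's a problem.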
QUICK TIP

Configuring network printers 
in Solaris 2.6 



by Asim Zuberi 

Configuring network printers in Solaris 
is simple and straightforward; all you 
need to know are the exact options of 
the commands. Not everyone has been using 
HP printers, which are very easy to configure 
with their jetadmin utility easily accessible 
from HP's Web site (www.hp.com/cposupport/ 
networking/software/ja245en.sol.html). There 
are lots of different kinds of printers out there 
that can be configured to a Solaris box via a 
network interface. 

The printers I dealt with were QMS, Tektronix,
and HP. The HP printers can easily be
configured with the jetadmin utility, which is
distributed in pkgadd format. All you need to do
is download it and run pkgadd to install it on your
computer. Change to the directory where you've
downloaded the software, and then follow the
steps below:

• uncompress SOLd515.PKG.Z 

• pkgadd -d SOLd515.PKG all 

By default, the jetadmin software gets 
installed in the /opt directory. 

• cd /opt/hpnp 

• ./jetadmin 

The jetadmin utility will display a menu on the
screen.

• Choose option 1, for the configuration.
Another menu will be displayed.

• Choose option 3 to add a printer to the
local spooler. It will then prompt you to
enter the IP address of the printer.

• Enter the network printer name or IP
address.

• Once the IP address is entered, the jetadmin
utility will ping the printer to check the
network connection. Once it receives packets
back, it will display a list of suggested
parameter values for the queue.

• Choose option 3, for the Queue Class.

• Enter the class name, and then choose 0 to
configure. Once it's configured, exit out of
the jetadmin utility.

Use the lpstat -t command to check the status
of the printer. If you'd like to make it the
default printer, use the following command:

# lpadmin -d <class_name>

Now, let's talk about other network printers.
Solaris comes with the lpadmin utility,
which can be used to configure the printers.
Here's how we'll use it:

# lpadmin -p <printername> -o \
protocol=bsd,dest=<printdest> \
-T PS -I postscript -v /dev/null \
-i /usr/lib/lp/model/netstandard

where printername is whatever you want to
call the printer and printdest is the printer's IP
address. We now need to change the permissions
and ownership of our device:

# chmod 666 /dev/null 

# chown root:sys /dev/null 

You can then install the filters as follows: 

# cd /etc/lp/fd
# for filter in *.fd ; do
> name=`basename $filter .fd`
> lpfilter -f $name -F $filter
> done
# accept <printername>
# enable <printername>
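
Before running the loop as root, it can help to see which lpfilter commands it would issue. The following sketch prints each command instead of executing it. The function name list_filter_cmds is hypothetical, and the directory is passed as an argument so you can try it on any directory that contains .fd files.

```shell
#!/bin/sh
# list_filter_cmds: dry run of the filter-installation loop. For each *.fd
# file in the given directory, print the lpfilter command that would
# register it; nothing is actually installed. Hypothetical helper.
list_filter_cmds() {
    dir=$1
    for filter in "$dir"/*.fd; do
        [ -f "$filter" ] || continue       # no .fd files matched the pattern
        name=`basename "$filter" .fd`      # e.g. postio.fd becomes postio
        echo "lpfilter -f $name -F $filter"
    done
}
```

Calling list_filter_cmds /etc/lp/fd shows one lpfilter line per filter definition file, with the filter name derived by stripping the .fd suffix, just as the loop above does.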

Printers that can't be configured through the BSD
protocol can be configured through the TCP protocol
with a specific port number instead.

# lpadmin -p <printername> -o \
protocol=tcp,dest=IP:9100 \
-T PS -I postscript -v /dev/null \
-i /usr/lib/lp/model/netstandard

IP is the IP address of your printer and 9100 is the port number. Don't forget
to install the filters for the printer as described above, and then run the
accept and enable steps. Some other useful commands for monitoring and
managing print queues are:

# lp -d <printer_name> (to print to a non-default printer)
# lp -n <copies> <file_name> (for multiple copies)
# lpadmin -d <printer_name/class_name> (to make the printer the default)
# lpadmin -x <printer_name/class_name> (to remove the printer)
# lpstat -t (shows the status of the printers)
# lpstat -o (shows the jobs in the queue)
# cancel <request_id> (to cancel the job)

Hopefully this will get you started with your network printer configuration.
You can also consult Sun's Answerbook for further information.
PING THE SOLARIS DUDE: SOLARIS Q&A

by Robert Owen Thomas 

What version of BIND
am I running?

When taking over the administration of a 
Solaris box, it's often difficult to deter- 
mine what modifications the previous adminis- 
trator may have made. One such modification 
might be an upgrade of the name server dae- 
mon. Starting with Solaris 7, BIND 8.1.2 ships 
with all Solaris OS media. So it's a good idea to 
ascertain the version of BIND running on your 
nameserver. 

To do this, there is a simple trick that can
be executed remotely using the nslookup(1M)
command:

: pudge; nslookup
Default Server: orc.research.cymru.com
Address: 192.168.0.254

> server goblin
Default Server: goblin.research.cymru.com
Address: 192.168.0.4

> set class=chaos
> set type=txt
> version.bind
Server: goblin.research.cymru.com
Address: 192.168.0.4

VERSION.BIND text = "8.1.2"



And there you have it! The host goblin is 
running version 8.1.2 of BIND. If the server 
returns the message: 



*** goblin.research.cymru.com can't find
version.bind: Server failed ***

the version of BIND is release 4.9.5 or earlier.
It also indicates that an upgrade of the BIND
software is a must!
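
If you need to check several name servers, you may prefer to pull the version string out of the nslookup output with a script. The sketch below is one way to do that, assuming the answer appears on a line containing text = "..." as in the session above. The function name bind_version is hypothetical, and the output format of your nslookup may differ.

```shell
#!/bin/sh
# bind_version: read nslookup output on stdin and print the quoted string
# from the first line containing text = "...". Prints nothing when no such
# line exists (for example, when the server answers "Server failed").
# Hypothetical helper; assumes the output format shown in the session above.
bind_version() {
    sed -n 's/.*text = "\([^"]*\)".*/\1/p' | head -1
}

# Possible non-interactive use (assumes your nslookup accepts -class and
# -type options on the command line, and that goblin is your name server):
#   nslookup -class=chaos -type=txt version.bind goblin | bind_version
```

An empty result is the scripted equivalent of the "Server failed" message: treat it as a sign that the server either predates the version.bind query or has been configured not to answer it.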

What's a segmentation 
violation? 

A segmentation violation occurs when a pointer 
causes a memory reference to a segment that isn't 
in your address space. The kernel catches this 
attempt, and delivers a signal 11, or SIGSEGV, to 
your process. The default signal handling action 
is for the process to dump core and abort. The 
result, for you, is the abrupt end of your applica- 
tion and a (perhaps large) core file in a directory. 

In general, this translates to the dereferencing
of a bad pointer. The pointer may have
been freed already (through the free() call).
The pointer may not have been assigned a
value. The pointer may have been munged by
a buffer overrun. Whatever the cause, the
dereferencing of the pointer in question leads to
a segmentation violation.

As an aside, a segmentation violation isn't
the same as a bus error. A bus error results in
signal 10 (SIGBUS) being delivered to the
process. This is usually the result of memory
misalignment.
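
From the shell, a process killed by a signal is reported with an exit status of 128 plus the signal number in Bourne-style shells such as sh and bash (other shells may differ), so a segmentation violation shows up as 139. The sketch below demonstrates this by having a child shell send SIGSEGV to itself. The function name segv_status is hypothetical.

```shell
#!/bin/sh
# segv_status: run a child shell that sends itself SIGSEGV and print the
# exit status the parent shell sees. Hypothetical helper for illustration.
segv_status() {
    status=0
    # ulimit -c 0 keeps the demonstration from leaving a core file behind.
    ( ulimit -c 0; sh -c 'kill -s SEGV $$' ) 2>/dev/null || status=$?
    echo "$status"
}

segv_status    # 128 + 11 (SIGSEGV) = 139 in sh and bash
```

Checking $? for 139 after a program exits is a quick way to tell a segmentation violation from an ordinary error exit when no core file was written.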